Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Early out in ObjectPool to avoid loop execution #1985

Merged
merged 1 commit into from
Oct 24, 2024

Conversation

tomatosalat0
Copy link
Contributor

This is a micro optimization which caught my eye when profiling my own application. I haven't tested the impact on overall execution performance. But given that the pool is used at basically every function invocation, I assume that it won't hurt to change it. Given that the change doesn't make the code much harder to understand or maintain, I just did it.


Motivation

The object pool is only used internally and a few of them are allocated per Engine instance. The pool has basically "best effort" guarantees. Given that Engine itself is not declared as thread safe, no synchronization was used for the change.

The performance problem was mainly with empty or almost empty pools. In those cases, the pool always looped over its internal array to find a possible match. Given that the pool can easily track how many items are currently stored within the array, an early out mechanism was added.

Benchmark

Here are the results. The benchmark code is attached below.

GetFromEmpty:

Method Mean Error StdDev Gen0 Allocated
Old 82.46 ns 1.503 ns 1.333 ns 0.0253 424 B
New 49.75 ns 0.972 ns 1.157 ns 0.0258 432 B

GetFromFilledWithOne:

Method Mean Error StdDev Gen0 Allocated
Old 95.70 ns 1.226 ns 1.147 ns 0.0253 424 B
New 53.81 ns 1.053 ns 1.952 ns 0.0258 432 B

GetFromAlmostFull:

Method Mean Error StdDev Gen0 Allocated
Old 83.84 ns 1.189 ns 1.112 ns 0.0253 424 B
New 82.10 ns 0.813 ns 0.760 ns 0.0257 432 B

GetFromFullyFilled:

Method Mean Error StdDev Gen0 Allocated
Old 97.51 ns 1.073 ns 0.951 ns 0.0267 448 B
New 91.66 ns 0.930 ns 0.870 ns 0.0272 456 B
Benchmark Code

The benchmark always fetches 12 items from the pool. The overhead of allocating and
filling the pool is not relevant because the same code is used for both, before and after.

[MemoryDiagnoser]
[GroupBenchmarksBy(BenchmarkLogicalGroupRule.ByCategory, BenchmarkLogicalGroupRule.ByParams)]
public class ObjectPoolBenchmark
{
    [Benchmark()]
    public void GetFromEmpty()
    {
        var pool = new ObjectPool<object>(Allocator, size: 10);
        AllocateItems(pool);
    }

    [Benchmark()]
    public void GetFromFilledWithOne()
    {
        var pool = new ObjectPool<object>(Allocator, size: 10);
        pool.Free(Allocator());
        AllocateItems(pool);
    }

    [Benchmark()]
    public void GetFromAlmostFull()
    {
        var pool = new ObjectPool<object>(Allocator, size: 10);
        for (int i = 0; i < 9; i++)
            pool.Free(Allocator());

        AllocateItems(pool);
    }

    [Benchmark()]
    public void GetFromFullyFilled()
    {
        var pool = new ObjectPool<object>(Allocator, size: 10);
        for (int i = 0; i < 11; i++)
            pool.Free(Allocator());

        AllocateItems(pool);
    }

    private static void AllocateItems(ObjectPool<object> pool)
    {
        for (int i = 0; i < 12; i++)
            pool.Allocate();
    }

    private static object Allocator()
        => new object();
}

@lahma
Copy link
Collaborator

lahma commented Oct 24, 2024

Great find and extensive benchmarking! The current implementation was based on old object pool code and since that it has been improved, I wonder if changes in dotnet/aspnetcore#45251 would reflect well on Jint side too...

@tomatosalat0
Copy link
Contributor Author

tomatosalat0 commented Oct 24, 2024

Good idea! I tried to use it but the results do not look great.

Microsoft.Extensions.ObjectPool:

Method Mean Error StdDev Gen0 Gen1 Allocated
GetFromEmpty 171.0 ns 3.08 ns 2.88 ns 0.0811 0.0002 1.33 KB
GetFromFilledWithOne 171.5 ns 2.58 ns 2.29 ns 0.0811 0.0002 1.33 KB
GetFromAlmostFull 220.2 ns 3.86 ns 3.61 ns 0.0811 0.0002 1.33 KB
GetFromFullyFilled 223.5 ns 1.98 ns 1.66 ns 0.0811 0.0002 1.33 KB

Which is not a big surprise because their pool does support multi threading, ours doesn't.
However, I took a look into their implementation and tried to replicate it - without concurrency support:

  • Usage of System.Collections.Generic.Queue<T> instead of the array with a pre-allocated capacity
  • Ensuring that the queue never gets more than N items.
  • Keep the usage of _firstItem as it is what they have done as well

Well, the results were better, but not as good as the PR version.

Queue<T> and _firstItem:

Method Mean Error StdDev Gen0 Allocated
GetFromEmpty 60.51 ns 1.228 ns 1.088 ns 0.0291 488 B
GetFromFilledWithOne 63.87 ns 1.285 ns 2.537 ns 0.0291 488 B
GetFromAlmostFull 97.33 ns 0.769 ns 0.719 ns 0.0291 488 B
GetFromFullyFilled 95.53 ns 0.806 ns 0.715 ns 0.0291 488 B

In summary: I think the stupid simple version of the Pool within the code is actually not that bad.

Copy link
Collaborator

@lahma lahma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for taking the time and extra check against the MS version, much appreciated!

@lahma lahma merged commit 6cb6d5d into sebastienros:main Oct 24, 2024
3 checks passed
@tomatosalat0 tomatosalat0 deleted the optimize-object-pool branch October 24, 2024 16:14
@sebastienros
Copy link
Owner

Pretty sure I have this pool implementation in other projects ... since it's taken from the dotnet runtime there might be a PR to do there also.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants